Abstract

Visceral leishmaniasis, or kala-azar, is a severe parasitic infection that causes the death of
thousands of people each year. The drugs currently available for treating kala-azar have serious
side effects, and their efficacy has decreased owing to the emergence of resistant strains. The
type of immune reaction must also be considered in patients infected with Leishmania donovani
(L. donovani). To eradicate the disease completely, modern research is being applied at both the
molecular level and the field level. Computational approaches such as remote sensing, geographic
information systems (GIS) and bioinformatics are key resources for detecting vectors and their
distribution, for identifying patterns and ecological and environmental factors, and for genomic
and proteomic analysis. Novel approaches such as GIS and bioinformatics have been usefully
applied in determining the causes of visceral leishmaniasis and in designing strategies to
prevent the disease from spreading from one region to another.
INTRODUCTION

On the Indian subcontinent, visceral leishmaniasis or kala-azar is a fatal vector-borne parasitic
disease that has increased in incidence over recent decades [1-2] and is considered an
anthroponosis. Leishmania spp., the etiological agent of kala-azar (transmitted by the sandfly
Phlebotomus argentipes), was first recognized in India in 1903 [3]. The kala-azar epidemic in
India started in Assam, spread to West Bengal and reached Bihar more than 100 years ago in the
Purnea district [4]. India contributes more than 80% of the kala-azar cases in the South East
Asian Region. According to the 2008 report (NVBDCP, 2008), kala-azar prevalence per 10,000
population was estimated to be 3.43 in Bihar, 0.13 in West Bengal, 0.001 in Uttar Pradesh, and
1.36 in Jharkhand. A large epidemic of 100,000 cases of kala-azar occurred in Bihar in 1977,
though the official figure was only 18,589 (unpublished data from the Department of Health,
Government of Bihar, based on reports from the Primary Health Centers). The disease incidence
came down from 77,101 cases in 1992 to 24,209 cases in 2009, and deaths from 1,419 to 93,
respectively. In 2010, 3,344 cases with 2 deaths had been recorded up to February. Fig. 1 shows
a time series plot of the number of people with kala-azar in India based on data from the World
Health Organization (WHO) report, 2008. Visceral leishmaniasis has developed an epidemic cycle,
recurring almost regularly every 15-20 years [7].

Fig. 1 Time series plot of the estimated number of people with kala-azar in India (adapted from:
WHO report, 2008)

To achieve the goal of eliminating the disease by 2015, the Indian government is providing full
support in addition to regular technical guidance. For example, timely and high-quality indoor
residual spraying, complete treatment of patients and intensive social mobilization are being
emphasized. Significant factors in favor of achieving the elimination goal are tools for early
diagnosis, including the rapid diagnostic test rK39, which can be used by trained peripheral
health workers. However, the overall prevalence continues to decrease despite relatively minor
increases in the kala-azar infection rate in the country. Consequently, the information system
linking the district administration, the Public Health Center (PHC) and the village level needs
to be strengthened to bridge the gap between outbreaks of the disease and the failure of
information to reach the authorities. In such cases, district authorities must respond more
quickly to patient needs, arrange better diagnostic tests and drugs, and implement good housing
programs.

More than a century ago, epidemiologists and health programmers began to investigate the
potential of maps for understanding the spatial dynamics of disease patterns and their
associations. Mapping can play an important role in both areas, as it is an excellent means of
communication. To be useful for resource planners, prediction of visceral leishmaniasis should
include a spatial component. It is therefore worthwhile to study the domain knowledge of remote
sensing (RS), geographic information systems (GIS) and bioinformatics and to integrate them with
the medical sciences in order to understand the advances and the gaps.

RS and the geographical bioinformatics system (GBS) are new fields of research and development
that draw on expertise from two well-established fields, geoinformatics and bioinformatics. The
term GBS refers to the application of computer science to the management of biological
information, using spatial and temporal maps and databases to collect and study complex patterns
and analyze their effects. Researchers use GIS applications to find and track large patterns, for
example, the geographic distribution of kala-azar and other diseases in human, animal and plant
populations. They are also exploring the potential of RS and GIS to visualize and enhance the
presentation of bioinformatics data, and to identify opportunities in bioinformatics and present
them to the GIS research community. From another perspective, they are exploring the use of
bioinformatics methodologies in GIS, aiming to enhance current GIS techniques and to identify
new approaches for pattern recognition and data analysis that could be used specifically for RS
and GBS purposes.

The following is a short review of RS and GIS applications that analyze and visualize
bioinformatics data to help bioinformatics experts study biological phenomena. These
applications represent solutions for various bioinformatics databases from RS and GIS
perspectives. The study mainly focuses on: 1) exploring future research in this new
multidisciplinary field; 2) the state of the art in the use of RS and GIS applied to endemic
visceral leishmaniasis (kala-azar); and 3) critically discussing the potential of RS and GIS for
kala-azar epidemiology and epidemiologists, together with some recommendations.

RS AND GIS APPLICATION ON BIOLOGICAL PHENOMENA

RS and visceral leishmaniasis or kala-azar 

The sandfly P. argentipes (the vector of Indian kala-azar) cannot be observed directly by remote
sensing. Multispectral, microwave or thermal imagery cannot detect sandflies from space, but it
can be used to identify suitable environments or breeding places. The distribution pattern of
the sandfly and the disease is strongly affected by physiography and ecology. Ecological and
environmental factors may contribute to the transmission of the disease by enhancing the
physiological activities of the vectors and the parasite. RS refers to the science of
identifying earth surface features and estimating their geo-biophysical properties using
electromagnetic radiation as a medium of interaction. RS technologies, which allow the mapping
of environmental variables, have been used in different epidemiological studies [8-12], but so
far only rarely in the context of visceral leishmaniasis. Few studies are available that include
the extraction of environmental indicators such as meteorology, vegetation and altitude
(Table 1). Nieto et al. developed an ecological niche model to delineate the distribution and
potential risk zone of visceral leishmaniasis.

Table 1 Visceral leishmaniasis related to remote sensing

Vector | Location | Satellite/Sensor | References
P. papatasi | SW Asia | NOAA (AVHRR) | Cross et al., 1996 [14]
P. orientalis | Sudan/Africa | NOAA (AVHRR) | Thompson et al., 1999 [57]
L. chagasi, L. longipalpis | NE Brazil | Landsat-5 (TM) | Thompson et al., 2002 [57]
P. orientalis | Sudan/Africa | SPOT | Elnaiem et al., 2003 [25]
L. longipalpis | Bahia/Brazil | Bio-climatic variable SRTM | Nieto et al., 2006 [85]
P. argentipes | Bihar/India | IRS LISS-III | Sudhakar et al., 2006 [17]
L. chagasi | Teresina/Brazil | Landsat-5 (TM) | Neto et al., 2009 [13]
P. argentipes | India | SRTM, NOAA | Bhunia et al., 2010 [8]
P. argentipes | NE India | NOAA (AVHRR) | Bhunia et al., 2010
P. alexandri | Middle East | NOAA (AVHRR) | Colacicco-Mayhugh et al., 2010 [86]
L. longipalpis | Brazil | LANDSAT (TM) | Werneck & Maguire, 2002
L. spp. | Brazil | LANDSAT (TM) | Aparicio & Dantas, 2003 [87]

To date, RS data used in kala-azar studies have mostly come from medium to low resolution
satellite sensors, such as Landsat's Multispectral Scanner (MSS) and Thematic Mapper (TM), the
National Oceanic and Atmospheric Administration (NOAA)'s Advanced Very High Resolution
Radiometer (AVHRR), and France's Système Pour l'Observation de la Terre (SPOT). A list of earth
observing satellite sensors is provided to introduce readers to the types of sensors that can be
used to determine the environmental factors responsible for visceral leishmaniasis (Table 2).
Most information obtained by satellite is correlated with vegetation through the Normalized
Difference Vegetation Index (NDVI), an index obtained from operations on spectral bands. In
kala-azar epidemiology, the use of high spatial resolution imagery is rare, and microwave
imagery is almost absent. However, this technology has been widely used for health issue
monitoring, disease re-emergence explanation, prediction and risk map amplification [9,11].
Cross et al. [11] applied a combination of weather data and AVHRR-GAC data to forecast the
geographic and seasonal distribution of Phlebotomus papatasi in southwest Asia. A computer model
was generated using the occurrence of P. papatasi as the dependent variable and the mean
synoptic weather data (114 meteorological stations) as independent variables. Mean monthly NDVI
data from the NOAA-AVHRR were calculated for the period 1982-1994. The frequency of NDVI levels
versus the probability of vector occurrence was then used to establish NDVI limits for vector
presence (0.00 + 0.06). This result provided useful information on the spatial and temporal
distribution of P. papatasi in the region.
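
The band operation behind NDVI can be sketched in a few lines. The example below is only an illustration, assuming the standard definition NDVI = (NIR - Red) / (NIR + Red); the function names, the reflectance values and the limits passed to within_limits are hypothetical and are not taken from any of the studies cited above.

```python
def ndvi(nir, red):
    """Return NDVI = (NIR - Red) / (NIR + Red) for a single pixel."""
    total = nir + red
    if total == 0:
        # Assumption made for this sketch: report 0.0 when both bands are zero.
        return 0.0
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply ndvi() pixel by pixel to two equally sized 2-D lists."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


def within_limits(grid, low, high):
    """Mark pixels whose NDVI lies inside a user-supplied interval."""
    return [[low <= value <= high for value in row] for row in grid]


# Hypothetical reflectance values for a 1 x 2 image.
grid = ndvi_grid([[0.5, 0.2]], [[0.1, 0.2]])
mask = within_limits(grid, -0.06, 0.06)  # placeholder limits, not published values
print(grid, mask)
```

In practice the two bands would be read from a satellite scene and the interval would come from a fitted vector-occurrence model rather than from fixed placeholder values.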

RS studies in sandfly habitat mapping 



Visceral leishmaniasis transmitted by sandflies [Phlebotomus argentipes (P. argentipes)] was
reported as a health hazard for troops deployed in India [15]. The spatial distribution of
sandflies is poorly understood, and knowledge of P. argentipes breeding sites remains scanty
[16]. A comparison of P. argentipes densities between areas endemic and non-endemic for visceral
leishmaniasis, using land use/land cover (LULC) characteristics derived from Indian Remote
Sensing (IRS) Linear Imaging Self Scanning III (LISS III) data in North India, showed that the
endemic areas had a higher percentage of water bodies, more succulent vegetation and a higher
vector density than the non-endemic areas [17]. The distribution of Ph. martini and Ph.
orientalis has been modeled using NDVI, midday land surface temperature, soil, and
agro-ecological characteristics [18]. The resulting map accurately identified all the areas
where sandflies were present and was useful to the health authorities in prioritizing their
visits to specific sites. In a predictive model of vector density in relation to NDVI, sandflies
were observed even at a null NDVI. With increasing values of NDVI, the number of sandflies
tended to decrease [19,20]; similarly, NDVI values have been used to delineate zones of low and
high probability of P. papatasi occurrence [11].

GIS in kala-azar control programme 

GIS is a special type of information system consisting of hardware, software, data, people and
procedures that work together to produce quality information. It is a key tool for mapping
vectors and evaluating the environmental factors that influence the spatial and temporal
distribution of vectors/insects in several public health and epidemiology projects [21]. One of
the most useful functions of GIS in kala-azar epidemiology continues to be basic mapping
[22-24]. While visual analyses (mapped evidence) and exploratory analyses are by and large
adequate for epidemiologists, the formal testing of certain hypotheses or the estimation of
relationships between measures of disease incidence and environmental covariates requires
quantitative modeling of disease distribution (Fig. 2).

Table 2 Earth observing satellite sensors used to determine responsible environmental factors
for mapping visceral leishmaniasis or kala-azar

Leishmania species | Vectors | Country: area | Environmental parameters | Satellite: Sensor | Technique/Base map | References
L.i. chagasi | Lu. longipalpis | Brazil: Minas Gerais | Vegetation, altitude, hydrographic basin | - | - | Margonari et al., 2006 [24]
L. chagasi | Lu. longipalpis | Brazil: Teresina | Vegetation | Landsat 5 TM | NDVI | Neto et al., 2009 [13]
L.i. chagasi | Lu. longipalpis | Argentina: Misiones | Land cover characteristics | IKONOS | Supervised classification | Fernandez et al., 2010
L. donovani | Ph. martini and Ph. orientalis | East Africa | Vegetation index and midday land surface temperature | NOAA (AVHRR) | NDVI, LST | Gebre-Michael et al., 2004 [18]
L. donovani | Ph. argentipes | India: Bihar | Vegetation index, land cover features | IRS-1C LISS III | NDVI, supervised classification | Sudhakar et al., 2006 [17]
L. donovani | Ph. argentipes | India: Bihar | Eco-environmental parameters | NOAA (AVHRR) | Supervised classification | Bhunia et al., 2010
L. donovani | Ph. argentipes | India: Bihar | Vegetation index, altitude | SRTM, Landsat TM 5 | DEM, NDVI | Bhunia et al., 2010
L.i. chagasi | Lu. longipalpis | Brazil: Bahia: Sanitarion de Barra | Vegetation | NOAA (AVHRR) | NDVI | Bavia et al., 2005
L. braziliensis | - | Argentina: Formosa: Las Lomitas | River, vegetation | Landsat 5 TM, Landsat 7 ETM+ | Visual identification | Salomon et al., 2006 [27]
L. donovani | Ph. orientalis | Sudan | Vegetation index and land surface temperature | NOAA (AVHRR) | NDVI, LST | Thomson et al., 1999 [43]
L. chagasi | L. longipalpis | Brazil: Ceara: Caninde | Vegetation indices, land cover characteristics | Landsat TM | NDVI, TC, unsupervised classification (ISODATA) | Thompson et al., 2002 [57]
L. chagasi | Lu. longipalpis | Brazil: Bahia | Altitude and climate | WorldClim (GTOPO 30) program | Interpolation of weather stations | Nieto et al., 2006 [85]
L. major, L. donovani | P. papatasi, Paraphlebotomus alexandri | Middle East | Elevation, precipitation, land cover, and WorldClim bioclimatic | AVHRR (NOAA) | Ecological niche model | Colacicco-Mayhugh et al., 2010 [86]
L. infantum | Lu. migonei | Argentina: La Banda, Santiago del Estero | - | Google Earth | Map visualization | Salomon et al., 2010
L. donovani | P. papatasi | SW Asia | Vegetation, weather data | NOAA (AVHRR) | Computer modeling using AVHRR-GAC data | Cross et al., 1996 [11]
L. donovani | Ph. orientalis | Africa: Sudan: Gedaref State | Vegetation status, wetness index, altitude | USGS data (hydrology, topography), SPOT | DEM, slope, aspect, compound topographic index, flow accumulation, NDVI | Elnaiem et al., 2003 [25]

GIS applications related to kala-azar have been introduced and used in the surveillance and
monitoring of diseases, in environmental health [8,17,25], in quantifying environmental hazards
and their influence on public health [26-28], and for policy and planning purposes [29-31]. In
India, GIS systems have been used in vector control research [32-34] and for studying and
mapping non-communicable diseases [35-37]. Geographical epidemiological studies, in which health
and environmental exposure data are analyzed in fine geographical detail, represent an important
new approach [38].

Fig. 2 Decision support system of kala-azar disease analysis through remote sensing and GIS
technique. Components shown: epidemiology data; entomology (spatial distribution of sandfly);
administrative boundary; location of health centres; major environmental features (river, marshy
land, soil, etc.); socio-economic data; physiographical data (altitude, vegetation, land cover,
etc.); climatic data (temperature, humidity, rainfall); models (statistical, analytical,
simulation).

The aims and purposes of disease mapping in the 
context of kala-azar are: 1) To describe the spatial 
variation in disease incidence for the formulation of 
etiological hypotheses; 2) To identify areas of unusu- 
ally high risk in order to take preventive action; 3) To 
provide a reliable map of disease risk in a region to 
allow better resource allocation and risk assessment. 
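
The second aim listed above, identifying areas of unusually high risk, can be illustrated with a small calculation. The sketch below assumes that district case counts and populations are available; the district names, the numbers and the threshold are hypothetical and serve only to show the arithmetic of a rate per 10,000 population.

```python
def incidence_per_10k(cases, population):
    """Return the number of cases per 10,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 10_000 * cases / population


def flag_high_risk(districts, threshold):
    """Return {district: rate} for districts whose rate exceeds the threshold.

    districts maps a district name to a (cases, population) pair.
    """
    rates = {
        name: incidence_per_10k(cases, population)
        for name, (cases, population) in districts.items()
    }
    return {name: rate for name, rate in rates.items() if rate > threshold}


# Hypothetical surveillance data, for illustration only.
example = {
    "District A": (120, 400_000),
    "District B": (15, 500_000),
    "District C": (300, 600_000),
}
print(flag_high_risk(example, threshold=1.0))
```

A fixed threshold is the simplest possible rule; smoothed or model-based rates would usually be preferred where populations are small.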

GIS may also involve more sophisticated spatial analysis of disease occurrence and contributing
environmental factors. However, the spatial and geostatistical analysis capability of GIS is
argued to be rather limited and, therefore, the value of GIS for spatial epidemiology should be
critically assessed and discussed. For example, displaying kala-azar cases alongside the density
and locations of the kala-azar vector, P. argentipes, on a vegetation layer [8] may reveal the
association between cases or sandflies and the distribution of vegetation cover. Similarly,
spatial analysis of visceral leishmaniasis cases in northern Brazil [39] identified proximity to
forests and pastures as the major risk factors.
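
Proximity analyses of this kind rest on a distance calculation between case locations and landscape features. The following sketch uses the haversine great-circle formula with a mean earth radius of 6371 km; the function names and the coordinates are hypothetical, and a real GIS would normally work with projected coordinates and feature geometries rather than single points.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean earth radius used by the haversine formula


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_feature_km(case_location, feature_points):
    """Distance from one case to the closest of several feature points."""
    lat, lon = case_location
    return min(haversine_km(lat, lon, f_lat, f_lon) for f_lat, f_lon in feature_points)


# Hypothetical case location and forest-edge points (latitude, longitude).
case = (25.60, 85.10)
forest_edges = [(25.70, 85.10), (25.60, 85.40)]
print(round(nearest_feature_km(case, forest_edges), 2))
```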



Research perspective: RS, GIS, and kala-azar 

Over the past decade, the focus of some of this as- 
sistance has been in the provision of GIS hardware, 
software and training. In theory, GIS can be a very 
effective tool in combating visceral leishmaniasis or 
kala-azar; however, in practice there have been a host 
of challenges to its successful use. 

Mapping visceral leishmaniasis (kala-azar) incidence/prevalence

This is the most basic application and involves mapping the incidence/prevalence of visceral
leishmaniasis (kala-azar) over some geographic area [40]. The focus is on examining past trends
as well as the present situation [41], and typically does not include any statistical analysis,
with the possible exception of relating visceral leishmaniasis (kala-azar) incidence/prevalence
to population in order to calculate populations at risk. The goal of these studies is to see
whether any obvious patterns exist.

Mapping of relationships between visceral leishmaniasis (kala-azar) incidence/prevalence and
other potentially related variables

The focus is still on past trends and the present situation. The goal of these studies is to see
whether any relationships exist between visceral leishmaniasis (kala-azar) incidence/prevalence
and a host of other variables, including temperature, rainfall, etc. [42-45]; land use/land
cover; elevation; demographics (age and gender); population movement [8,17,46,47]; climate
change [48,49]; and breeding sites and control programs [50-52]. In most cases, these studies
involve testing whether any statistical relationships exist.
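
A first test for such a relationship is often a simple correlation coefficient. The sketch below computes Pearson's r from paired observations using only the standard library; the function name and the rainfall and case values are hypothetical, and a correlation of this kind does not by itself account for spatial autocorrelation or confounding.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical monthly rainfall (mm) and reported cases, for illustration only.
rainfall = [10, 40, 120, 260, 180, 60]
cases = [12, 15, 30, 55, 41, 20]
print(round(pearson_r(rainfall, cases), 3))
```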

Using innovative methods of collecting data 

The most important limitation of GIS is data gathering. The literature mostly deals with RS in
the form of aerial photography and satellite imagery, but does not mention the scale or the
reliability of the data. Currently, GIS data are compiled at different scales. When GIS data at
one scale are overlaid with GIS data at other scales, many accuracy problems are encountered
[5]. As noted earlier, spatial statistical analysis is a newly developing field and has no
agreed or standard methodologies [56-58].
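
One common way to reduce such scale mismatches is to bring raster layers to a common cell size before overlaying them. The sketch below aggregates a fine grid to a coarser one by block averaging; it is a simplified illustration with a hypothetical function name and values, and it assumes the grids are already aligned and that the fine grid dimensions are exact multiples of the aggregation factor.

```python
def aggregate_mean(grid, factor):
    """Aggregate a 2-D list to a coarser grid by averaging factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [
                grid[i][j]
                for i in range(r, r + factor)
                for j in range(c, c + factor)
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Hypothetical 4 x 4 fine-resolution layer reduced to 2 x 2.
fine = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [1, 1, 1, 1],
    [3, 3, 3, 3],
]
print(aggregate_mean(fine, 2))
```

Averaging suits continuous variables such as NDVI or temperature; categorical layers such as land cover would need a majority rule instead.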

Modeling visceral leishmaniasis (kala-azar) risk 

Modeling approaches, which include both inductive/empirical models and deductive/theoretical
models, have proven to be highly relevant in kala-azar studies. There is a need for tailored
capacity-building activities and for attention to performance-based programs that expand the
response to the epidemic toward a stronger multi-sector response. Risk models typically use many
of the same variables discussed above; the difference is that statistical relationships are
established between visceral leishmaniasis (kala-azar) incidence/prevalence (the dependent
variable) and a range of independent variables in an effort to predict future cases of visceral
leishmaniasis (kala-azar) [59] and to develop multilevel models at different spatial scales to
investigate disease transmission. Nieto et al. [85] developed two predictive ecological niche
models within a geographic information system, using genetic algorithm rule-set prediction
(GARP) and the growing degree day (GDD)-water budget (WB) concept, to predict the distribution
and potential risk of visceral leishmaniasis in the State of Bahia, Brazil.
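
The growing degree day idea can be shown with a short calculation. The sketch below uses the simple averaging form, GDD = max(0, (Tmin + Tmax) / 2 - Tbase), which is a common textbook definition and not necessarily the exact formulation of the cited study; the base temperature and the daily temperatures are hypothetical, and the water budget component is not modeled here.

```python
def growing_degree_days(t_min, t_max, t_base):
    """Daily GDD by the simple averaging method, never below zero."""
    daily_mean = (t_min + t_max) / 2
    return max(0.0, daily_mean - t_base)


def accumulated_gdd(daily_temperatures, t_base):
    """Sum daily GDD over a sequence of (t_min, t_max) pairs."""
    return sum(
        growing_degree_days(t_min, t_max, t_base)
        for t_min, t_max in daily_temperatures
    )


# Hypothetical temperatures in degrees Celsius and a hypothetical base temperature.
days = [(20.0, 30.0), (5.0, 15.0), (18.0, 22.0)]
print(accumulated_gdd(days, t_base=15.0))
```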

What RS and GIS software is being used? 

The refereed literature is probably not the best place to get an idea of the type of RS and GIS
software used by those dealing with visceral leishmaniasis (kala-azar) research and control.
This is because the software used by visceral leishmaniasis (kala-azar) researchers is typically
different from that used by public health practitioners. An important issue is that most image
processing software [ERDAS Imagine, Multispec©, GRASS (www.grass.itc.it), MrSID GeoExpress View
(www.lizardtech.co)] typically originates from the United States or Europe. In some cases, this
results in problems obtaining copies of the software as well as support for it. Some GIS
software, such as ArcGIS® (ArcMap, ARC/INFO), Intergraph GeoMedia® (www.integraph.com),
HealthMapper and research analyst, promotes the use of GIS as a tool for analysis and problem
solving. The package offers simplified tools and interfaces to efficiently carry out
bio-statistical and geographical analysis to support decision-making in kala-azar control
program strategy. Such information, when mapped together, creates a powerful tool for monitoring
and management of disease and other public health programs.

Bioinformatics and GIS 

Bioinformatics, RS and GIS are emerging fields, not only in disease epidemiology but in many
other fields. The application of these technologies has been documented and praised for
accuracy, speed and economy. Bioinformatics is the science that deals with the collection,
organization and analysis of large amounts of biological data using advanced information
technology such as computer networks, software and databases [60]. It extends simple searches of
computer databases into new ways of merging data and answering complex queries in environmental
studies, molecular biology and climate change. Bioinformatics will alter epidemiological studies
as the cataloguing of collections is completed and methods are developed to investigate
associations among different datasets. Researchers in bioinformatics generally look at very
small patterns, such as motifs, genomic maps and structures that might predispose an organism
[61], whereas GIS tools can be used by researchers to find and track large patterns, for
example, the geographic distribution of kala-azar [62], and to show the relationship between the
location of disease and land cover or soil types.

Bioinformatics and GIS have much in common, most notably large dedicated databases,
visualization techniques, pattern recognition, digital maps and analysis. Researchers face
formidable challenges in trying to recognize and analyze meaningful patterns in the rapidly
growing volumes of data and information [63]. Both disciplines rely heavily on maps (genomic and
proteomic maps in the case of bioinformatics and geographical maps in the case of GIS) for
abstract representations of data. The use of bioinformatics methodologies in geographical
information systems aims at enhancing current GIS techniques and identifying new approaches to
pattern recognition and data analysis that could be used specifically for GBS purposes [64]. For
dynamic representation, GIS methodologies can be applied for efficient biological database
management while a database is being developed. The marine microbial diversity database, which
provides a GIS option with an interface for selecting a particular sampling location along with
genome sequences and their details, illustrates the fusion of GIS with bioinformatics [65]. In
GIS, provenance information includes a depiction of the lineage of the data product, including a
description of the data source, the transformations used to derive it, reference to the control
information and the mathematical transformations of the coordinates [66]. The Lineage
Information Program (LIP) follows a data-oriented provenance technique for GIS and is used for
informational purposes and to update stale data, regenerate data and compare data [67,68]. We
consider that if this lineage information is stored and recorded in machine-readable form, as in
bioinformatics, it can be applied to RS data dissemination and management to realize a
functional model of kala-azar disease. Monitoring, quantifying and predicting the human-health
consequences of kala-azar through the management of environmental and biological information
relies on spatial and temporal maps and databases to collect and study complex patterns and
analyze their effects.
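
What a machine-readable lineage record might look like can be sketched with a small data structure. The example below is an assumption made purely for illustration: the class name, field names and values are hypothetical and do not reproduce the format of LIP or of any existing metadata standard. It records the elements of provenance named above (data source, transformations and coordinate reference) and serializes them to JSON.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class LineageRecord:
    """Hypothetical lineage record for a derived RS/GIS data product."""

    product: str
    sources: list
    transformations: list = field(default_factory=list)
    coordinate_reference: str = ""

    def add_step(self, description):
        """Append one processing step to the lineage, in order of application."""
        self.transformations.append(description)

    def to_json(self):
        """Serialize the record so that other programs can read it."""
        return json.dumps(asdict(self), sort_keys=True)


# Hypothetical product and processing history.
record = LineageRecord(
    product="ndvi_composite",
    sources=["satellite_scene_A"],
    coordinate_reference="WGS 84",
)
record.add_step("radiometric correction")
record.add_step("NDVI computed from red and near-infrared bands")
print(record.to_json())
```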

In the past, GIS and spatial epidemiology focused 
on finding and monitoring large-scale, population- 
based occurrences such as disease clusters, outbreaks 
of infection, or possible associations between vector, 
pathogen and environmental factors. The amalgama- 
tion of genomics and proteomics with GIS and spatial 
epidemiology has the potential to provide an immense 
breakthrough. This will permit us to do a far better 
job of monitoring, quantifying, and predicting human- 
health consequences connected with the environment. 

Software tools have been developed in bioinformatics to analyze micro-level data such as
nucleotide or amino acid sequence data and to extract biological information. Gene prediction
software (http://www.scfbio-iitd.res.in/chemgenome/chemgenomenew.jsp and
http://www.genomethreader.org/) and sequence alignment software
(http://blast.ncbi.nlm.nih.gov/Blast.cgi and http://www.ebi.ac.uk/tools/clustalw2) are examples
of software developed for bioinformatics. A multiple sequence alignment of KMP11 amino acid
sequences from seven Leishmania strains is shown in Fig. 3. Gene prediction software is used to
identify a gene within a long genomic sequence. Efficient software such as ArcGIS may become
useful if it is utilized for visualizing, analyzing and querying the genome database better than
the presently available techniques [69]. Thus, GIS function development techniques can be
replicated to make the available software more efficient. Dolan et al. [69] adapted general
concepts, methodologies and tools for the display of geographic data to develop a Genome Spatial
Information System (GenoSIS) for the spatial display of genomes. Schweizer [70] showed the
spatial aspects of bioinformatics through imaging and image processing tools that could be used
for pattern recognition and analysis. FBK (http://mpba.fbk.eu/en/home) developed novel
mathematical methods and an ICT platform that may link the physiographical patterns of disease
with the high-dimensional data now available from functional genomics (e.g. DNA microarrays,
SNPs, proteomics, deep sequencing), with spatial epidemic simulation systems, and with
geo-databases of environmental factors and socio-demographic data. Programs such as GenoSIS and
the FBK tools are a step forward in the integration of GIS and bioinformatics, which can bring
dynamism to database searches and to predictive models of complex spatio-temporal patterns.
Another aspect of genome research is molecular modeling and 3-D visualization. The maps produced
by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy, which are preserved
in the Protein Data Bank (PDB), aid protein structure determination. Similarly, in GIS,
ArcScene™ could be used for 3-D visualization, modeling and analysis of spatially distributed
PDB data [71].
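
The notion of conservation in a multiple sequence alignment can be made concrete with a simple per-column score. The sketch below takes sequences that are already aligned (for example, by a tool such as Clustal W) and reports, for each column, the fraction of sequences sharing the most common residue. The function name and the toy sequences are hypothetical; they are not KMP11 data, and this score is simpler than the conservation measures reported by alignment viewers.

```python
from collections import Counter


def column_conservation(aligned_sequences):
    """Fraction of sequences sharing the most common residue in each column.

    Gap characters ('-') are not counted as residues but still count toward
    the number of sequences, so gappy columns score lower.
    """
    lengths = {len(seq) for seq in aligned_sequences}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*aligned_sequences):
        residues = [residue for residue in column if residue != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Toy alignment of three short sequences, for illustration only.
toy_alignment = ["MAT", "MAS", "M-T"]
print(column_conservation(toy_alignment))
```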

Much like bioinformatics, geoinformatics uses the processing power and algorithms of computer
science to organize information effectively. The goal is not just to store data, but to gain
knowledge from biological research, with special reference to visceral leishmaniasis. Data in
bioinformatics are of a spatial nature and could be well understood if symbolized, scrutinized
and interpreted as geospatial data. In bioinformatics studies, geoinformatics can be used
interactively for better dynamism, versatility and efficiency. This helps in managing genome
data interactively with the application of superior GIS functionality.

Bioinformatics and drug discovery: a hope for controlling kala-azar

Bioinformatics is the key to rational drug design.
Most drug targets are proteins, so it is important to
identify their 3-D structure in detail. Genome projects
generate sequence information at a much higher rate
than NMR and X-ray laboratories can determine
structures[72]. Because experimentally determined X-ray
or NMR structures exist for only a small fraction of
proteins, homology modeling is used to predict the 3-D
structure of a protein from a template structure with
bioinformatics software. MODELLER
(http://www.salilab.org/modeller/) and SWISS-MODEL
(http://swissmodel.expasy.org/) are well-known tools.
For a number of Leishmania proteins, modeled structures
have proved a good way of obtaining three-dimensional
coordinates[73]. Bioinformatics-based virtual screening
has implicated sulphoraphane, an anticancer compound,
in the treatment of kala-azar, and a 3-D model of SIR2
in L. major has shown several potentially important
structural differences in the nicotinamide-binding
catalytic domain[74]. These applications and findings
of bioinformatics have led to cost- and time-effective
research. Software such as Insight II (now incorporated
in Discovery Studio (Accelrys)) and high-performance
computing workstations have reduced the number of
trials needed in the screening of drug compounds and
help to identify potential drug targets for a specific
disease. This initiative aims to identify the functions
of different proteins and predict their structures so
that these could be used as potential targets for
developing drugs against Leishmania.

Fig. 3 Multiple sequence alignment of KMP11 protein sequences from seven different Leishmania strains, generated
with the Clustal W program[92]. Conserved sequence regions are colored by residue across a group of sequences
hypothesized to be evolutionarily related. The lower part of the image, highlighted in yellow and black, shows the
conservation, quality and consensus corresponding to the amino acid or codon throughout the sequences.
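
Homology modeling, as described above, depends on choosing a template whose sequence is sufficiently similar to the target. As a minimal sketch of that selection step, the Python snippet below computes percent identity over pre-aligned sequences and picks the most similar template. The sequences, the template names and the 30% cutoff are illustrative assumptions, not values from the studies cited here, and tools such as MODELLER or SWISS-MODEL use far more elaborate procedures.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def best_template(target, templates, min_identity=30.0):
    """Return (name, identity) of the most similar template, or None if it falls below the cutoff.

    `templates` maps a template name to its sequence aligned to `target`
    and is assumed to be non-empty.
    """
    scored = [(name, percent_identity(target, seq)) for name, seq in templates.items()]
    name, identity = max(scored, key=lambda item: item[1])
    return (name, identity) if identity >= min_identity else None


# Hypothetical, pre-aligned toy sequences (not real Leishmania proteins).
target = "MKT-LLVAGG"
templates = {
    "template_A": "MKTALLVSGG",
    "template_B": "MRS-IIVQGA",
}
print(best_template(target, templates))
```

Percent identity is only a first filter; in practice, alignment coverage and the quality of the template structure would also influence the choice.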

Computer-aided drug design (CADD) is another exciting
and diverse discipline in which various aspects of
applied and basic research combine and stimulate each
other to discover, enhance and study drugs and related
biologically active molecules[75]. CADD methods depend
heavily on bioinformatics tools, applications and
databases. It is highly important to find the exact
compound that binds strongly to the target. Virtual
high-throughput screening (vHTS) is one method of
searching for such molecules[76]. All protein targets
can be screened against databases of small-molecule
compounds. If there is a hit, the particular compound
can be selected for further testing. Virtual
high-throughput screening is expected to increase the
impact of virtual screening in the drug discovery
process[77]. Another challenge is to find promising
leads. After an appropriate lead compound is obtained,
it is essential to optimize the structure and
properties of the potential drug. Usually, optimization
involves making a number of modifications to the
compound in order to explore variants of the lead
candidate. Lead optimization tools such as WABE offer a
rational approach to drug design
(http://demo.eyesopen.com/about/news/press_releases/2004/Wabel3.html).
Swiss-PDB is another excellent tool that can predict
key physicochemical properties, such as hydrophobicity
and polarity, that have a profound influence on how
drugs bind to proteins (http://spdbv.vital-it.ch/). To
avoid the failure of drug candidates, numerous
bioinformatics software packages are available, such as
Insight II, Cerius and Discovery Studio, which can
predict toxicity (ADMET) and bioactivity[78]. The
predictive power of bioinformatics software helps to
choose only the most promising drug candidates against
Leishmania.
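
The screening workflow described above (score every compound against a target, keep the hits, then discard candidates with unfavorable properties) can be sketched in a few lines of Python. Everything in the example is a labeled assumption: the compound names, docking scores and descriptor values are invented, the score cutoff is arbitrary, and the property filter is Lipinski's rule of five, a common drug-likeness heuristic that this review does not name. It is not the method implemented in Insight II, Cerius or Discovery Studio.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    docking_score: float   # hypothetical binding score in kcal/mol; lower is better
    mol_weight: float      # molecular weight in Da
    log_p: float           # octanol-water partition coefficient (hydrophobicity)
    h_donors: int          # hydrogen-bond donors
    h_acceptors: int       # hydrogen-bond acceptors


def passes_rule_of_five(c: Compound) -> bool:
    """Lipinski's rule of five: tolerate at most one violation."""
    violations = sum([
        c.mol_weight > 500,
        c.log_p > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])
    return violations <= 1


def screen(library, score_cutoff=-7.0):
    """Keep drug-like compounds scoring at or below the cutoff, best score first."""
    hits = [
        c for c in library
        if c.docking_score <= score_cutoff and passes_rule_of_five(c)
    ]
    return sorted(hits, key=lambda c: c.docking_score)


# Invented toy library; names and numbers are placeholders, not measured data.
library = [
    Compound("cmpd_1", -9.2, 342.0, 2.1, 2, 5),
    Compound("cmpd_2", -8.5, 712.0, 6.3, 6, 12),   # good score, fails drug-likeness
    Compound("cmpd_3", -5.1, 280.0, 1.0, 1, 3),    # drug-like, weak score
    Compound("cmpd_4", -7.4, 455.0, 4.8, 3, 8),
]
for hit in screen(library):
    print(hit.name, hit.docking_score)
```

Filtering on simple descriptors before any expensive modeling reflects the general idea of removing unpromising candidates early; a real ADMET prediction would rely on trained models rather than fixed thresholds.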

Role of bioinformatics in advancing functional genomics in Leishmania

Success in decoding the genetic blueprint has led to
the post-genomic era, in which there is a need to fuse
information technology with biomedicine. The
post-genomic challenge, especially in the case of human
pathogens, is to translate new information about genes,
their control pathways, proteins and their interactions
into improved healthcare. The development of several
bioinformatics approaches and methodologies has led to
the discovery of protein-coding genes[79,80].
Algorithms for finding protein-coding genes are based
mainly either on similarity to known sequences
(extrinsic methods) or on statistical properties of the
sequence itself (intrinsic methods)[81]. Each algorithm
is designed mainly to detect true positives and to
exclude false positives.
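
As a rough illustration of an intrinsic method, the Python sketch below scans the three forward reading frames of a DNA string for open reading frames (an ATG followed in frame by a stop codon). It is a deliberately simplified example under stated assumptions: real gene finders also examine the reverse strand, codon statistics and trained models, and the toy sequence and the `min_codons` threshold are invented for demonstration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3):
    """Return (start, end) pairs, 0-based and end-exclusive, for forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon (inclusive)
    and must contain at least `min_codons` codons, counting start and stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close it and look for the next ATG
    return sorted(orfs)


# Invented toy sequence, not a Leishmania gene.
sequence = "CCATGGCTTGAAATGAAACCCGGGTAGTT"
for start, end in find_orfs(sequence):
    print(start, end, sequence[start:end])
```

Raising `min_codons` trades sensitivity for specificity: short spurious ORFs (false positives) are excluded, at the risk of missing genuinely short genes.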

Modern machine learning methodologies are used to solve
computational problems in molecular biology. Support
Vector Machine (SVM), Genetic Algorithm (GA),
Artificial Neural Network (ANN) and Hidden Markov Model
(HMM) are machine learning approaches that are used for
classification and regression analysis of
high-throughput biological data such as gene expression
data[82]. Machine learning approaches are widely used
for gene finding, motif detection, sequence evolution,
statistical genetics, and the analysis of genetic
polymorphisms (genotypes), single-nucleotide
polymorphisms (SNPs) and comparative genomic
hybridization (CGH) data. Consensus epitope predictions
from more than 8,271 annotated protein sequences of
L. major with 5-8 different algorithms, including some
machine learning algorithms, allowed the identification
of 78 class I CD8+ epitopes and have opened
opportunities for the identification of targets for
vaccine development[83]. Hidden Markov models and the
Viterbi algorithm have also been applied to integrated
bioinformatics analyses of putative flagellar
actin-interacting proteins in Leishmania species[84].
These are some bright examples of how machine learning
approaches and bioinformatics analysis move forward
the functional genomics of Leishmania species.
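
Because the Viterbi algorithm is named above, a compact sketch may help readers unfamiliar with it. The Python function below finds the most probable hidden-state path of an HMM, working in log space. The two-state model (an "H" state favoring hydrophobic residues and an "L" state favoring polar ones), its probabilities and the reduced two-letter observation alphabet are invented for illustration and are unrelated to the models used in the cited Leishmania study.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_path, log_probability) for a non-empty observation sequence.

    All probabilities must be non-zero because the recursion works in log space.
    """
    # best[t][s] = log-probability of the best path ending in state s at time t
    best = [{
        s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
        for s in states
    }]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical two-state model; "h" = hydrophobic residue, "p" = polar residue.
states = ("H", "L")
start_p = {"H": 0.5, "L": 0.5}
trans_p = {"H": {"H": 0.8, "L": 0.2}, "L": {"H": 0.2, "L": 0.8}}
emit_p = {"H": {"h": 0.9, "p": 0.1}, "L": {"h": 0.2, "p": 0.8}}

path, log_prob = viterbi("hhhppp", states, start_p, trans_p, emit_p)
print("".join(path), round(log_prob, 3))
```

Working with log-probabilities avoids numerical underflow on long sequences, which matters when the observations are whole protein or genome sequences.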

Bioinformatics tools and techniques are now being
applied to the comparative genomics of Leishmania.
Finding out why different Leishmania strains cause such
diverse diseases has been an elusive and compelling
goal for many parasitologists. Comparative genomics
using bioinformatics tools and techniques may provide
new hints for understanding the disease-causing
mechanisms of a range of species.

Future of RS and GBS 

The speed of computation is rising owing to the
development of bioinformatics and geoinformatics tools
and techniques for the management and analysis of
biological data. The combination of several
bioinformatics databases, software and the
visualization techniques of GIS may help to identify
biological trends and relationships. The integration of
various data sources, such as clinical,
epidemiological, genomic and proteomic data, will allow
kala-azar specialists to use disease symptoms related
to kala-azar to predict and choose better future
medicines. GBS can boost the discovery rate in
bioinformatics, particularly in large-scale comparative
genomics. Intelligent GBS techniques may deliver fast
and accurate results when integrated into existing
bioinformatics systems for tasks such as monitoring,
quantifying and predicting the impact of environmental
phenomena (for example, weather changes and emerging
infectious diseases) on human health. In public health,
GIS can play a vital role in resolving issues that
require spatial analysis and spatial attention. Hence,
kala-azar specialists, along with epidemiologists,
should be able to predict disease outbreaks by
examining spatial data (GIS data) together with
bioinformatics data.

CONCLUSION 

At present, developers of GIS and bioinformatics tools
should try to hybridize them and develop a program for
better representation of the genome at a higher level
of organization, i.e. chromosomal and cellular DNA. GIS
tries to look at DNA at the macro level, whereas
bioinformatics software tries to look at the genome at
the micro level. Future clinicians seeking better
medicine for a particular disease may be able to
examine changes in the genome of the pathogen using
collaborative geo-bioinformatics software. Hence,
future clinicians may be able to prescribe an
appropriate drug to a patient rather than exposing the
patient to the adverse effects caused by presently used
medicines. Algorithms used by GIS software for
geographical information analysis may be applicable to
the analysis of biological information; hence, specific
tools might have to be designed for this purpose.